Overview

Dataset statistics

Number of variables: 46
Number of observations: 5095270
Missing cells: 29852079
Missing cells (%): 12.7%
Duplicate rows: 5754
Duplicate rows (%): 0.1%
Total size in memory: 1.7 GiB
Average record size in memory: 368.0 B
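
These headline figures can be recomputed with plain pandas; for example, the missing-cell percentage is 29852079 / (46 × 5095270) ≈ 12.7%. The sketch below is a minimal illustration, not the report's own code: the function name `overview` and the toy frame are hypothetical, and the report may define duplicate rows differently from `DataFrame.duplicated`.

```python
import numpy as np
import pandas as pd


def overview(df: pd.DataFrame) -> dict:
    """Recompute headline dataset statistics for a DataFrame."""
    n_rows, n_cols = df.shape
    missing_cells = int(df.isna().sum().sum())
    # Counts repeats beyond the first occurrence (assumed definition).
    duplicate_rows = int(df.duplicated().sum())
    total_bytes = int(df.memory_usage(deep=True).sum())
    return {
        "variables": n_cols,
        "observations": n_rows,
        "missing_cells": missing_cells,
        "missing_cells_pct": 100 * missing_cells / (n_rows * n_cols),
        "duplicate_rows": duplicate_rows,
        "duplicate_rows_pct": 100 * duplicate_rows / n_rows,
        "total_bytes": total_bytes,
        "avg_record_bytes": total_bytes / n_rows,
    }


# Hypothetical toy data, only to exercise the function.
toy = pd.DataFrame(
    {"TP_SEXO": ["F", "M", "F", "F"], "TP_ENSINO": [1.0, np.nan, 1.0, 1.0]}
)
stats = overview(toy)
```

On the real data the same call would take the full 46-column frame; `memory_usage(deep=True)` is used so that string columns are measured by content rather than by pointer size.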

Variable types

Categorical: 40
Numeric: 6

Alerts

NU_ANO has constant value "2019" [Constant]
Dataset has 5754 (0.1%) duplicate rows [Duplicates]
NO_MUNICIPIO_ESC has a high cardinality: 5277 distinct values [High cardinality]
NO_MUNICIPIO_PROVA has a high cardinality: 1692 distinct values [High cardinality]
Q018 is highly correlated with NU_ANO [High correlation]
Q006 is highly correlated with NU_ANO [High correlation]
TP_ST_CONCLUSAO is highly correlated with IN_TREINEIRO and 6 other fields [High correlation]
IN_TREINEIRO is highly correlated with TP_ST_CONCLUSAO and 6 other fields [High correlation]
Q001 is highly correlated with NU_ANO [High correlation]
Q024 is highly correlated with NU_ANO [High correlation]
Q022 is highly correlated with NU_ANO [High correlation]
TP_LOCALIZACAO_ESC is highly correlated with TP_ST_CONCLUSAO and 2 other fields [High correlation]
Q020 is highly correlated with NU_ANO [High correlation]
Q014 is highly correlated with NU_ANO [High correlation]
Q012 is highly correlated with NU_ANO [High correlation]
Q025 is highly correlated with NU_ANO [High correlation]
Q013 is highly correlated with NU_ANO [High correlation]
TP_ENSINO is highly correlated with IN_TREINEIRO and 1 other fields [High correlation]
Q017 is highly correlated with NU_ANO [High correlation]
Q016 is highly correlated with NU_ANO [High correlation]
Q002 is highly correlated with NU_ANO [High correlation]
TP_SEXO is highly correlated with NU_ANO [High correlation]
SG_UF_ESC is highly correlated with TP_ST_CONCLUSAO and 3 other fields [High correlation]
TP_ESCOLA is highly correlated with TP_ST_CONCLUSAO and 2 other fields [High correlation]
Q008 is highly correlated with NU_ANO [High correlation]
falta is highly correlated with NU_ANO [High correlation]
Q015 is highly correlated with NU_ANO [High correlation]
NU_ANO is highly correlated with Q018 and 36 other fields [High correlation]
TP_NACIONALIDADE is highly correlated with NU_ANO [High correlation]
Q019 is highly correlated with NU_ANO [High correlation]
Q023 is highly correlated with NU_ANO [High correlation]
TP_SIT_FUNC_ESC is highly correlated with TP_ST_CONCLUSAO and 2 other fields [High correlation]
Q011 is highly correlated with NU_ANO [High correlation]
Q021 is highly correlated with NU_ANO [High correlation]
Q009 is highly correlated with NU_ANO [High correlation]
TP_DEPENDENCIA_ADM_ESC is highly correlated with TP_ST_CONCLUSAO and 3 other fields [High correlation]
Q003 is highly correlated with NU_ANO [High correlation]
Q010 is highly correlated with NU_ANO [High correlation]
TP_ESTADO_CIVIL is highly correlated with NU_ANO [High correlation]
SG_UF_PROVA is highly correlated with SG_UF_ESC and 1 other fields [High correlation]
Q004 is highly correlated with NU_ANO [High correlation]
Q007 is highly correlated with NU_ANO [High correlation]
TP_ENSINO has 2217073 (43.5%) missing values [Missing]
CO_MUNICIPIO_ESC has 3947858 (77.5%) missing values [Missing]
NO_MUNICIPIO_ESC has 3947858 (77.5%) missing values [Missing]
CO_UF_ESC has 3947858 (77.5%) missing values [Missing]
SG_UF_ESC has 3947858 (77.5%) missing values [Missing]
TP_DEPENDENCIA_ADM_ESC has 3947858 (77.5%) missing values [Missing]
TP_LOCALIZACAO_ESC has 3947858 (77.5%) missing values [Missing]
TP_SIT_FUNC_ESC has 3947858 (77.5%) missing values [Missing]
TP_COR_RACA has 103201 (2.0%) zeros [Zeros]
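
Two of the alert types above are simple to reproduce by hand: a column is constant when it holds a single distinct value, and the missing share is the count of missing values over the number of rows (3947858 / 5095270 ≈ 77.5% for the school columns). Since NU_ANO is constant, the correlation alerts that involve it are probably an artifact of the constant column rather than a real association. The following is a minimal sketch under stated assumptions: the function name `basic_alerts`, the 0.4 threshold, and the toy frame are illustrative and are not taken from the report's configuration.

```python
import numpy as np
import pandas as pd


def basic_alerts(df: pd.DataFrame, missing_threshold: float = 0.4) -> dict:
    """Flag constant columns and columns with a large share of missing values."""
    # A column with exactly one distinct value (NaN counted as a value) is constant.
    constant = [c for c in df.columns if df[c].nunique(dropna=False) == 1]
    # isna().mean() gives the fraction of missing rows per column.
    missing_share = df.isna().mean()
    high_missing = missing_share[missing_share > missing_threshold].to_dict()
    return {"constant": constant, "high_missing": high_missing}


# Hypothetical toy data mirroring the shape of the alerts.
toy = pd.DataFrame(
    {
        "NU_ANO": [2019, 2019, 2019, 2019],
        "TP_SEXO": ["F", "M", "F", "F"],
        "TP_ENSINO": [1.0, np.nan, np.nan, np.nan],
    }
)
alerts = basic_alerts(toy)
```

Dropping the constant column before computing correlations is one way to avoid the spurious alerts.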

Reproduction

Analysis started: 2022-04-14 13:02:53.700755
Analysis finished: 2022-04-14 13:32:50.318472
Duration: 29 minutes and 56.62 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

NU_ANO
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value   Count     Frequency (%)
2019    5095270   100.0%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_SEXO
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: F
4th row: F
5th row: F

Common Values

Value   Count     Frequency (%)
F       3031822   59.5%
M       2063448   40.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_ESTADO_CIVIL
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count     Frequency (%)
1       4398485   86.3%
2       395855    7.8%
0       206800    4.1%
3       87894     1.7%
4       6236      0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_COR_RACA
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.128251103
Minimum: 0
Maximum: 5
Zeros: 103201
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.017512094
Coefficient of variation (CV): 0.4780977639
Kurtosis: -1.1623697
Mean: 2.128251103
Median Absolute Deviation (MAD): 1
Skewness: -0.1019156763
Sum: 10844014
Variance: 1.035330861
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
Common values

Value   Count     Frequency (%)
3       2364063   46.4%
1       1831779   36.0%
2       648309    12.7%
4       116162    2.3%
0       103201    2.0%
5       31756     0.6%
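
Because TP_COR_RACA takes only six values, the summary statistics above can be checked from the frequency table alone. The sketch below recomputes the sum, mean, variance, coefficient of variation and quartiles from the counts. The use of the sample variance (ddof=1) and of a "first value whose cumulative share reaches q" rule for quantiles are assumptions about the report's conventions, not something the report states.

```python
import numpy as np

# Frequency table for TP_COR_RACA, copied from the report.
values = np.array([0, 1, 2, 3, 4, 5])
counts = np.array([103201, 1831779, 648309, 2364063, 116162, 31756])

n = int(counts.sum())                     # number of observations
total = int((values * counts).sum())      # "Sum" in the report
mean = total / n

# Weighted sample variance (ddof=1 is an assumption).
variance = float((counts * (values - mean) ** 2).sum() / (n - 1))
std = float(np.sqrt(variance))
cv = std / mean                           # coefficient of variation

# Cumulative share of observations up to and including each value.
cum_share = counts.cumsum() / n


def quantile(q: float) -> int:
    """Return the first value whose cumulative share reaches q."""
    return int(values[np.searchsorted(cum_share, q)])


q1, median, q3 = quantile(0.25), quantile(0.50), quantile(0.75)
iqr = q3 - q1
```

With more than five million rows, the difference between ddof=0 and ddof=1 is far below the precision shown in the report, so either convention agrees with the printed variance.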

TP_NACIONALIDADE
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count     Frequency (%)
1       4950021   97.1%
2       124332    2.4%
4       8233      0.2%
0       6749      0.1%
3       5935      0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_ST_CONCLUSAO
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count     Frequency (%)
1       2992921   58.7%
2       1465895   28.8%
3       616668    12.1%
4       19786     0.4%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_ESCOLA
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count     Frequency (%)
1       3629376   71.2%
2       1247262   24.5%
3       218632    4.3%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

TP_ENSINO
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2217073
Missing (%): 43.5%
Memory size: 38.9 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value       Count     Frequency (%)
1.0         2864398   56.2%
2.0         13799     0.3%
(Missing)   2217073   43.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies, excluding missing values]

Value   Count     Frequency (%)
1.0     2864398   99.5%
2.0     13799     0.5%
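
The two TP_ENSINO tables report different percentages for the same counts because the first divides by all 5095270 rows and the second only by the rows where the value is present. A short sketch of that arithmetic follows, with a pandas equivalent on a hypothetical toy series; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Counts copied from the report.
counts = {"1.0": 2864398, "2.0": 13799}
missing = 2217073

n_present = sum(counts.values())
n_total = n_present + missing

# Share of all rows (missing included in the denominator).
share_of_all = {k: round(100 * v / n_total, 1) for k, v in counts.items()}
# Share of rows with a value (missing excluded from the denominator).
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

# The same distinction in pandas, on a hypothetical toy series.
toy = pd.Series([1.0, 1.0, 2.0, np.nan])
with_missing = toy.value_counts(normalize=True, dropna=False)
without_missing = toy.value_counts(normalize=True, dropna=True)
```

When comparing categories across columns with different missing rates, the second form is usually the safer one to quote, as long as the missing share is reported alongside it.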


IN_TREINEIRO
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count     Frequency (%)
0       4478602   87.9%
1       616668    12.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies]

CO_MUNICIPIO_ESC
Real number (ℝ≥0)

MISSING

Distinct: 5544
Distinct (%): 0.5%
Missing: 3947858
Missing (%): 77.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3128667.27
Minimum: 1100015
Maximum: 5300108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 MiB

Quantile statistics

Minimum: 1100015
5-th percentile: 1501402
Q1: 2314102
median: 3148004
Q3: 3550308
95-th percentile: 5108402
Maximum: 5300108
Range: 4200093
Interquartile range (IQR): 1236206

Descriptive statistics

Standard deviation: 969519.812
Coefficient of variation (CV): 0.3098826843
Kurtosis: -0.1314850572
Mean: 3128667.27
Median Absolute Deviation (MAD): 542802
Skewness: 0.2435558394
Sum: 3.589870369 × 10^12
Variance: 9.399686658 × 10^11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common values

Value                 Count     Frequency (%)
3550308               52736     1.0%
2304400               29738     0.6%
3304557               27441     0.5%
5300108               21075     0.4%
1302603               16141     0.3%
3106200               14396     0.3%
2927408               13330     0.3%
2611606               12181     0.2%
1501402               11949     0.2%
4106902               11580     0.2%
Other values (5534)   936845    18.4%
(Missing)             3947858   77.5%

Smallest values

Value     Count   Frequency (%)
1100015   158     < 0.1%
1100023   716     < 0.1%
1100031   24      < 0.1%
1100049   926     < 0.1%
1100056   99      < 0.1%
1100064   172     < 0.1%
1100072   29      < 0.1%
1100080   55      < 0.1%
1100098   240     < 0.1%
1100106   346     < 0.1%

Largest values

Value     Count   Frequency (%)
5300108   21075   0.4%
5222302   15      < 0.1%
5222203   18      < 0.1%
5222054   28      < 0.1%
5222005   62      < 0.1%
5221908   14      < 0.1%
5221858   933     < 0.1%
5221809   85      < 0.1%
5221700   86      < 0.1%
5221601   264     < 0.1%
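
CO_MUNICIPIO_ESC is an identifier, so its mean, skewness and similar statistics carry little meaning; it is typed as a real number most likely because missing values force a float column. The codes also appear to embed the state code in their first two digits: 3550308 has the same count (52736) as São Paulo in NO_MUNICIPIO_ESC, and 5300108 has the same count (21075) as CO_UF_ESC 53. The sketch below shows one way to handle this, assuming that seven-digit convention holds for every code; the derived column name `CO_UF_FROM_MUNICIPIO` is hypothetical.

```python
import numpy as np
import pandas as pd

# Municipality codes taken from the tables above; NaN stands for a missing school.
codes = pd.Series(
    [3550308.0, 5300108.0, 1100015.0, np.nan], name="CO_MUNICIPIO_ESC"
)

# Treat the code as a nullable integer identifier rather than a real number.
codes_int = codes.astype("Int64")

# Assumption: the first two of the seven digits encode the state (UF).
uf_from_code = (codes_int // 100000).rename("CO_UF_FROM_MUNICIPIO")

# Casting to a categorical type keeps profilers from computing numeric moments.
codes_cat = codes_int.astype("category")
```

Missing values stay missing through the integer division, so the derived state code has the same 77.5% missing share as the source column.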

NO_MUNICIPIO_ESC
Categorical

HIGH CARDINALITY
MISSING

Distinct: 5277
Distinct (%): 0.5%
Missing: 3947858
Missing (%): 77.5%
Memory size: 38.9 MiB

Length

Max length: 32
Median length: 9
Mean length: 10.1963314
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 14
Unique (%): < 0.1%

Sample

1st row: Uberlândia
2nd row: Apucarana
3rd row: Alvorada
4th row: Mogi das Cruzes
5th row: Lauro de Freitas

Common Values

Value                 Count     Frequency (%)
São Paulo             52736     1.0%
Fortaleza             29738     0.6%
Rio de Janeiro        27441     0.5%
Brasília              21075     0.4%
Manaus                16141     0.3%
Belo Horizonte        14396     0.3%
Salvador              13330     0.3%
Recife                12181     0.2%
Belém                 12089     0.2%
Curitiba              11580     0.2%
Other values (5267)   936705    18.4%
(Missing)             3947858   77.5%

Length

[Figure: histogram of lengths of the category]

Value                 Count     Frequency (%)
são                   117770    6.3%
do                    71375     3.8%
de                    70253     3.8%
paulo                 54655     2.9%
rio                   45007     2.4%
fortaleza             29977     1.6%
janeiro               27441     1.5%
brasília              21254     1.1%
grande                18042     1.0%
da                    17548     0.9%
Other values (3949)   1391073   74.6%
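
The table above counts individual lowercase words rather than whole municipality names, which is why "são" (117770) exceeds the count for "São Paulo" (52736). A comparable word-frequency table can be sketched with pandas string methods. Lowercasing and splitting on whitespace is an assumption about how the report tokenises names, and the toy names below are illustrative.

```python
import pandas as pd

# Hypothetical toy column; None stands for a missing school municipality.
names = pd.Series(["São Paulo", "São Luís", "Rio de Janeiro", None, "Rio Branco"])

# One row per word: drop missing names, lowercase, split on whitespace.
words = names.dropna().str.lower().str.split().explode()

# Frequency of each word and its share of all words (not of all rows).
word_counts = words.value_counts()
word_share = (100 * word_counts / word_counts.sum()).round(1)
```

The percentages are shares of all words, so they are not comparable with the row-based percentages in the Common Values table.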


CO_UF_ESC
Real number (ℝ≥0)

MISSING

Distinct: 27
Distinct (%): < 0.1%
Missing: 3947858
Missing (%): 77.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.13281977
Minimum: 11
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 MiB

Quantile statistics

Minimum: 11
5-th percentile: 15
Q1: 23
median: 31
Q3: 35
95-th percentile: 51
Maximum: 53
Range: 42
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 9.665164065
Coefficient of variation (CV): 0.3104493629
Kurtosis: -0.09646923525
Mean: 31.13281977
Median Absolute Deviation (MAD): 5
Skewness: 0.2724421361
Sum: 35722171
Variance: 93.4153964
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=27)]
Common values

Value               Count     Frequency (%)
35                  219426    4.3%
23                  112313    2.2%
31                  108173    2.1%
33                  73086     1.4%
29                  70328     1.4%
26                  61506     1.2%
41                  52788     1.0%
15                  46742     0.9%
43                  46173     0.9%
21                  43497     0.9%
Other values (17)   313380    6.2%
(Missing)           3947858   77.5%

Smallest values

Value   Count    Frequency (%)
11      12069    0.2%
12      5330     0.1%
13      25817    0.5%
14      2203     < 0.1%
15      46742    0.9%
16      6136     0.1%
17      9087     0.2%
21      43497    0.9%
22      24156    0.5%
23      112313   2.2%

Largest values

Value   Count    Frequency (%)
53      21075    0.4%
52      35546    0.7%
51      18022    0.4%
50      15539    0.3%
43      46173    0.9%
42      34646    0.7%
41      52788    1.0%
35      219426   4.3%
33      73086    1.4%
32      25422    0.5%

SG_UF_ESC
Categorical

HIGH CORRELATION
MISSING

Distinct: 27
Distinct (%): < 0.1%
Missing: 3947858
Missing (%): 77.5%
Memory size: 38.9 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MG
2nd row: PR
3rd row: RS
4th row: SP
5th row: BA

Common Values

Value               Count     Frequency (%)
SP                  219426    4.3%
CE                  112313    2.2%
MG                  108173    2.1%
RJ                  73086     1.4%
BA                  70328     1.4%
PE                  61506     1.2%
PR                  52788     1.0%
PA                  46742     0.9%
RS                  46173     0.9%
MA                  43497     0.9%
Other values (17)   313380    6.2%
(Missing)           3947858   77.5%

Length

[Figure: histogram of lengths of the category]

Value               Count    Frequency (%)
sp                  219426   19.1%
ce                  112313   9.8%
mg                  108173   9.4%
rj                  73086    6.4%
ba                  70328    6.1%
pe                  61506    5.4%
pr                  52788    4.6%
pa                  46742    4.1%
rs                  46173    4.0%
ma                  43497    3.8%
Other values (17)   313380   27.3%

TP_DEPENDENCIA_ADM_ESC
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 3947858
Missing (%): 77.5%
Memory size: 38.9 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 4.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value       Count     Frequency (%)
2.0         924338    18.1%
4.0         164333    3.2%
1.0         49091     1.0%
3.0         9650      0.2%
(Missing)   3947858   77.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies, excluding missing values]

Value   Count    Frequency (%)
2.0     924338   80.6%
4.0     164333   14.3%
1.0     49091    4.3%
3.0     9650     0.8%

TP_LOCALIZACAO_ESC
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 3947858
Missing (%): 77.5%
Memory size: 38.9 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value       Count     Frequency (%)
1.0         1105871   21.7%
2.0         41541     0.8%
(Missing)   3947858   77.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies, excluding missing values]

Value   Count     Frequency (%)
1.0     1105871   96.4%
2.0     41541     3.6%

TP_SIT_FUNC_ESC
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 3947858
Missing (%): 77.5%
Memory size: 38.9 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value       Count     Frequency (%)
1.0         1144363   22.5%
4.0         1593      < 0.1%
2.0         1258      < 0.1%
3.0         198       < 0.1%
(Missing)   3947858   77.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of category frequencies, excluding missing values]

Value   Count     Frequency (%)
1.0     1144363   99.7%
4.0     1593      0.1%
2.0     1258      0.1%
3.0     198       < 0.1%

CO_MUNICIPIO_PROVA
Real number (ℝ≥0)

Distinct: 1727
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3088260.601
Minimum: 1100015
Maximum: 5300108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 MiB

Quantile statistics

Minimum: 1100015
5-th percentile: 1500800
Q1: 2408003
median: 3125101
Q3: 3549805
95-th percentile: 5201108
Maximum: 5300108
Range: 4200093
Interquartile range (IQR): 1141802

Descriptive statistics

Standard deviation: 997757.953
Coefficient of variation (CV): 0.3230808802
Kurtosis: -0.1531684341
Mean: 3088260.601
Median Absolute Deviation (MAD): 518901
Skewness: 0.2713466131
Sum: 1.573552159 × 10^13
Variance: 9.955209329 × 10^11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common values

Value                 Count     Frequency (%)
3550308               216016    4.2%
3304557               134779    2.6%
5300108               95860     1.9%
2304400               95089     1.9%
2927408               87894     1.7%
3106200               87365     1.7%
1302603               85435     1.7%
1501402               79512     1.6%
2111300               74941     1.5%
2611606               62377     1.2%
Other values (1717)   4076002   80.0%

Smallest values

Value     Count   Frequency (%)
1100015   549     < 0.1%
1100023   3755    0.1%
1100049   4219    0.1%
1100056   475     < 0.1%
1100064   496     < 0.1%
1100098   827     < 0.1%
1100106   2577    0.1%
1100114   1636    < 0.1%
1100122   4849    0.1%
1100130   558     < 0.1%

Largest values

Value     Count   Frequency (%)
5300108   95860   1.9%
5221858   5314    0.1%
5221601   1920    < 0.1%
5221403   4607    0.1%
5220603   1026    < 0.1%
5220454   3213    0.1%
5220207   912     < 0.1%
5220108   1715    < 0.1%
5219753   1395    < 0.1%
5219308   804     < 0.1%

NO_MUNICIPIO_PROVA
Categorical

HIGH CARDINALITY

Distinct: 1692
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 28
Median length: 9
Mean length: 9.959621767
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sorocaba
2nd row: Feira de Santana
3rd row: Fortaleza
4th row: Palmas
5th row: Contagem

Common Values

Value                 Count     Frequency (%)
São Paulo             216016    4.2%
Rio de Janeiro        134779    2.6%
Brasília              95860     1.9%
Fortaleza             95089     1.9%
Salvador              87894     1.7%
Belo Horizonte        87365     1.7%
Manaus                85435     1.7%
Belém                 79512     1.6%
São Luís              74941     1.5%
Recife                62377     1.2%
Other values (1682)   4076002   80.0%

Length

[Figure: histogram of lengths of the category]

Value                 Count     Frequency (%)
são                   517467    6.4%
de                    324827    4.0%
do                    250486    3.1%
rio                   222982    2.7%
paulo                 220909    2.7%
janeiro               134779    1.7%
brasília              97395     1.2%
fortaleza             95089     1.2%
belo                  92077     1.1%
horizonte             90712     1.1%
Other values (1608)   6083619   74.8%

CO_UF_PROVA
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.7330587
Minimum: 11
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 MiB

Quantile statistics

Minimum: 11
5-th percentile: 15
Q1: 24
median: 31
Q3: 35
95-th percentile: 52
Maximum: 53
Range: 42
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 9.947315641
Coefficient of variation (CV): 0.3236682602
Kurtosis: -0.1197349961
Mean: 30.7330587
Median Absolute Deviation (MAD): 5
Skewness: 0.298780704
Sum: 156593232
Variance: 98.94908846
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=27)]
Common values

Value               Count     Frequency (%)
35                  815992    16.0%
31                  534627    10.5%
29                  395436    7.8%
33                  339718    6.7%
23                  294984    5.8%
15                  279596    5.5%
26                  275318    5.4%
43                  218472    4.3%
21                  218076    4.3%
41                  211306    4.1%
Other values (17)   1511745   29.7%

Smallest values

Value   Count    Frequency (%)
11      58639    1.2%
12      38645    0.8%
13      118146   2.3%
14      12958    0.3%
15      279596   5.5%
16      42636    0.8%
17      46249    0.9%
21      218076   4.3%
22      122336   2.4%
23      294984   5.8%

Largest values

Value   Count    Frequency (%)
53      95860    1.9%
52      169983   3.3%
51      88121    1.7%
50      70393    1.4%
43      218472   4.3%
42      110699   2.2%
41      211306   4.1%
35      815992   16.0%
33      339718   6.7%
32      102273   2.0%

SG_UF_PROVA
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SP
2nd row: BA
3rd row: CE
4th row: TO
5th row: MG

Common Values

Value               Count     Frequency (%)
SP                  815992    16.0%
MG                  534627    10.5%
BA                  395436    7.8%
RJ                  339718    6.7%
CE                  294984    5.8%
PA                  279596    5.5%
PE                  275318    5.4%
RS                  218472    4.3%
MA                  218076    4.3%
PR                  211306    4.1%
Other values (17)   1511745   29.7%

Length

[Figure: histogram of lengths of the category]

Q001
Categorical

HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
E
1308991 
B
1135816 
C
723518 
D
586620 
H
444806 
Other values (3)
895519 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB
2nd rowC
3rd rowE
4th rowC
5th rowE

Common Values

Value  Count    Frequency (%)
E      1308991  25.7%
B      1135816  22.3%
C      723518   14.2%
D      586620   11.5%
H      444806   8.7%
F      375721   7.4%
A      290615   5.7%
G      229183   4.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
e      1308991  25.7%
b      1135816  22.3%
c      723518   14.2%
d      586620   11.5%
h      444806   8.7%
f      375721   7.4%
a      290615   5.7%
g      229183   4.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q002
Categorical

HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
E
1633692 
B
854826 
C
686198 
D
667415 
F
513711 
Other values (3)
739428 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB
2nd rowE
3rd rowE
4th rowE
5th rowE

Common Values

Value  Count    Frequency (%)
E      1633692  32.1%
B      854826   16.8%
C      686198   13.5%
D      667415   13.1%
F      513711   10.1%
G      399470   7.8%
A      194625   3.8%
H      145333   2.9%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
e      1633692  32.1%
b      854826   16.8%
c      686198   13.5%
d      667415   13.1%
f      513711   10.1%
g      399470   7.8%
a      194625   3.8%
h      145333   2.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q003
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
C
1203182 
A
1092542 
B
1079077 
D
865524 
F
575122 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowB
3rd rowB
4th rowB
5th rowC

Common Values

Value  Count    Frequency (%)
C      1203182  23.6%
A      1092542  21.4%
B      1079077  21.2%
D      865524   17.0%
F      575122   11.3%
E      279823   5.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
c      1203182  23.6%
a      1092542  21.4%
b      1079077  21.2%
d      865524   17.0%
f      575122   11.3%
e      279823   5.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q004
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
B
2204272 
D
1023529 
A
889104 
F
437465 
C
349395 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowB
3rd rowC
4th rowD
5th rowC

Common Values

Value  Count    Frequency (%)
B      2204272  43.3%
D      1023529  20.1%
A      889104   17.4%
F      437465   8.6%
C      349395   6.9%
E      191505   3.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      2204272  43.3%
d      1023529  20.1%
a      889104   17.4%
f      437465   8.6%
c      349395   6.9%
e      191505   3.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q005
Real number (ℝ≥0)

Distinct20
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.868090013
Minimum1
Maximum20
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size38.9 MiB

Quantile statistics

Minimum1
5-th percentile2
Q13
median4
Q35
95-th percentile6
Maximum20
Range19
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.480990442
Coefficient of variation (CV)0.382873831
Kurtosis5.400695987
Mean3.868090013
Median Absolute Deviation (MAD)1
Skewness1.21053676
Sum19708963
Variance2.193332689
MonotonicityNot monotonic
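
Several of the Q005 figures above are derived from one another, which makes a quick consistency check possible. The sketch below recomputes them from the reported values using the standard definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). It assumes the report uses these definitions; the variable names are illustrative.

```python
# Values copied from the Q005 statistics and the dataset overview.
n_rows = 5095270          # number of observations
total = 19708963          # Sum
std = 1.480990442         # Standard deviation
q1, q3 = 3, 5             # Q1 and Q3
minimum, maximum = 1, 20  # Minimum and Maximum

mean = total / n_rows            # reported Mean: 3.868090013
cv = std / mean                  # reported CV: 0.382873831
variance = std ** 2              # reported Variance: 2.193332689
iqr = q3 - q1                    # reported IQR: 2
value_range = maximum - minimum  # reported Range: 19

print(f"mean={mean:.9f} cv={cv:.9f} variance={variance:.9f}")
print(f"iqr={iqr} range={value_range}")
```

The recomputed values agree with the reported ones to within rounding, so the block is internally consistent.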
Histogram with fixed size bins (bins=20)

Value              Count    Frequency (%)
4                  1639448  32.2%
3                  1299838  25.5%
5                  830392   16.3%
2                  645557   12.7%
6                  306841   6.0%
1                  142014   2.8%
7                  128880   2.5%
8                  55546    1.1%
9                  20920    0.4%
10                 14192    0.3%
Other values (10)  11642    0.2%

Value  Count    Frequency (%)
1      142014   2.8%
2      645557   12.7%
3      1299838  25.5%
4      1639448  32.2%
5      830392   16.3%
6      306841   6.0%
7      128880   2.5%
8      55546    1.1%
9      20920    0.4%
10     14192    0.3%

Value  Count    Frequency (%)
20     572      < 0.1%
19     132      < 0.1%
18     159      < 0.1%
17     183      < 0.1%
16     331      < 0.1%
15     619      < 0.1%
14     749      < 0.1%
13     1512     < 0.1%
12     2916     0.1%
11     4469     0.1%

Q006
Categorical

HIGH CORRELATION

Distinct17
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
C
1311465 
B
1273250 
D
499616 
E
475568 
G
301922 
Other values (12)
1233449 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowC
3rd rowB
4th rowE
5th rowG

Common Values

Value             Count    Frequency (%)
C                 1311465  25.7%
B                 1273250  25.0%
D                 499616   9.8%
E                 475568   9.3%
G                 301922   5.9%
A                 232075   4.6%
F                 226251   4.4%
H                 192912   3.8%
I                 143652   2.8%
J                 80789    1.6%
Other values (7)  357770   7.0%

Length

Histogram of lengths of the category

Value             Count    Frequency (%)
c                 1311465  25.7%
b                 1273250  25.0%
d                 499616   9.8%
e                 475568   9.3%
g                 301922   5.9%
a                 232075   4.6%
f                 226251   4.4%
h                 192912   3.8%
i                 143652   2.8%
j                 80789    1.6%
Other values (7)  357770   7.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q007
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
4725052 
B
 
197936
D
 
130163
C
 
42119

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

Value  Count    Frequency (%)
A      4725052  92.7%
B      197936   3.9%
D      130163   2.6%
C      42119    0.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      4725052  92.7%
b      197936   3.9%
d      130163   2.6%
c      42119    0.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q008
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
B
3600387 
C
1029646 
D
 
274380
E
 
141446
A
 
49411

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB
2nd rowB
3rd rowB
4th rowC
5th rowC

Common Values

Value  Count    Frequency (%)
B      3600387  70.7%
C      1029646  20.2%
D      274380   5.4%
E      141446   2.8%
A      49411    1.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      3600387  70.7%
c      1029646  20.2%
d      274380   5.4%
e      141446   2.8%
a      49411    1.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q009
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
C
2576656 
D
1534163 
B
697386 
E
 
246169
A
 
40896

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowC
2nd rowC
3rd rowB
4th rowC
5th rowD

Common Values

Value  Count    Frequency (%)
C      2576656  50.6%
D      1534163  30.1%
B      697386   13.7%
E      246169   4.8%
A      40896    0.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
c      2576656  50.6%
d      1534163  30.1%
b      697386   13.7%
e      246169   4.8%
a      40896    0.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q010
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
2897668 
B
1757786 
C
375812 
D
 
52511
E
 
11493

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowB

Common Values

Value  Count    Frequency (%)
A      2897668  56.9%
B      1757786  34.5%
C      375812   7.4%
D      52511    1.0%
E      11493    0.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      2897668  56.9%
b      1757786  34.5%
c      375812   7.4%
d      52511    1.0%
e      11493    0.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q011
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
3953236 
B
1032706 
C
 
98012
D
 
9538
E
 
1778

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowB
5th rowA

Common Values

Value  Count    Frequency (%)
A      3953236  77.6%
B      1032706  20.3%
C      98012    1.9%
D      9538     0.2%
E      1778     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      3953236  77.6%
b      1032706  20.3%
c      98012    1.9%
d      9538     0.2%
e      1778     < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q012
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
B
4794247 
C
 
196882
A
 
87415
D
 
13969
E
 
2757

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB
2nd rowB
3rd rowB
4th rowB
5th rowB

Common Values

Value  Count    Frequency (%)
B      4794247  94.1%
C      196882   3.9%
A      87415    1.7%
D      13969    0.3%
E      2757     0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      4794247  94.1%
c      196882   3.9%
a      87415    1.7%
d      13969    0.3%
e      2757     0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q013
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
3262348 
B
1708106 
C
 
108667
D
 
13020
E
 
3129

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowB

Common Values

Value  Count    Frequency (%)
A      3262348  64.0%
B      1708106  33.5%
C      108667   2.1%
D      13020    0.3%
E      3129     0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      3262348  64.0%
b      1708106  33.5%
c      108667   2.1%
d      13020    0.3%
e      3129     0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q014
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
B
2962279 
A
2087266 
C
 
43504
D
 
1737
E
 
484

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowB
5th rowB

Common Values

Value  Count    Frequency (%)
B      2962279  58.1%
A      2087266  41.0%
C      43504    0.9%
D      1737     < 0.1%
E      484      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      2962279  58.1%
a      2087266  41.0%
c      43504    0.9%
d      1737     < 0.1%
e      484      < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q015
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
4515899 
B
572381 
C
 
6135
D
 
528
E
 
327

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

Value  Count    Frequency (%)
A      4515899  88.6%
B      572381   11.2%
C      6135     0.1%
D      528      < 0.1%
E      327      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      4515899  88.6%
b      572381   11.2%
c      6135     0.1%
d      528      < 0.1%
e      327      < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q016
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
2591648 
B
2477890 
C
 
24114
D
 
1213
E
 
405

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowB

Common Values

Value  Count    Frequency (%)
A      2591648  50.9%
B      2477890  48.6%
C      24114    0.5%
D      1213     < 0.1%
E      405      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      2591648  50.9%
b      2477890  48.6%
c      24114    0.5%
d      1213     < 0.1%
e      405      < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q017
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
4970640 
B
 
122228
C
 
1720
D
 
389
E
 
293

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

Value  Count    Frequency (%)
A      4970640  97.6%
B      122228   2.4%
C      1720     < 0.1%
D      389      < 0.1%
E      293      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      4970640  97.6%
b      122228   2.4%
c      1720     < 0.1%
d      389      < 0.1%
e      293      < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q018
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
4122588 
B
972682 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

Value  Count    Frequency (%)
A      4122588  80.9%
B      972682   19.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      4122588  80.9%
b      972682   19.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q019
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
B
3519448 
C
909125 
D
 
286892
A
 
251152
E
 
128653

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowB
3rd rowB
4th rowC
5th rowB

Common Values

Value  Count    Frequency (%)
B      3519448  69.1%
C      909125   17.8%
D      286892   5.6%
A      251152   4.9%
E      128653   2.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      3519448  69.1%
c      909125   17.8%
d      286892   5.6%
a      251152   4.9%
e      128653   2.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q020
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
3546388 
B
1548882 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA

Common Values

Value  Count    Frequency (%)
A      3546388  69.6%
B      1548882  30.4%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      3546388  69.6%
b      1548882  30.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q021
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
3982985 
B
1112285 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowB
5th rowA

Common Values

Value  Count    Frequency (%)
A      3982985  78.2%
B      1112285  21.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      3982985  78.2%
b      1112285  21.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q022
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
C
1595029 
D
1386256 
B
1085417 
E
919229 
A
 
109339

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowC
2nd rowE
3rd rowC
4th rowD
5th rowC

Common Values

Value  Count    Frequency (%)
C      1595029  31.3%
D      1386256  27.2%
B      1085417  21.3%
E      919229   18.0%
A      109339   2.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
c      1595029  31.3%
d      1386256  27.2%
b      1085417  21.3%
e      919229   18.0%
a      109339   2.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q023
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
3891775 
B
1203495 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowB

Common Values

Value  Count    Frequency (%)
A      3891775  76.4%
B      1203495  23.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      3891775  76.4%
b      1203495  23.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q024
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size38.9 MiB
A
2345467 
B
2202984 
C
384607 
D
 
114655
E
 
47557

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowC
2nd rowA
3rd rowA
4th rowB
5th rowB

Common Values

ValueCountFrequency (%)
A2345467
46.0%
B2202984
43.2%
C384607
 
7.5%
D114655
 
2.3%
E47557
 
0.9%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
a      2345467  46.0%
b      2202984  43.2%
c      384607   7.5%
d      114655   2.3%
e      47557    0.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Q025
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

B: 3954805
A: 1140465

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: B
3rd row: B
4th row: B
5th row: B

Common Values

Value  Count    Frequency (%)
B      3954805  77.6%
A      1140465  22.4%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
b      3954805  77.6%
a      1140465  22.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

falta
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.9 MiB

0: 3704302
1: 1390968

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      3704302  72.7%
1      1390968  27.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count    Frequency (%)
0      3704302  72.7%
1      1390968  27.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[Figures: 36 interaction plots (Matplotlib v3.5.1); images not recoverable]

Correlations

[Figure: correlation heatmap (Matplotlib v3.5.1); image not recoverable]

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
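
To make the bias correction concrete, here is a minimal pure-Python sketch of the Bergsma-corrected Cramér's V for two categorical sequences. It uses the plain Pearson chi-squared statistic without a continuity correction, so it may not reproduce the report's values exactly; the function name `cramers_v_corrected` and the toy inputs are assumptions for illustration.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two categorical sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared over the full r x k contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # A constant variable carries no association information.
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Toy data (hypothetical): one perfectly associated pair, one independent pair.
perfect = cramers_v_corrected(["A", "B"] * 50, ["0", "1"] * 50)
independent = cramers_v_corrected(["A", "A", "B", "B"] * 25, ["0", "1", "0", "1"] * 25)
print(perfect, independent)
```

Returning 0.0 when one variable is constant is a choice made in this sketch; the coefficient is undefined in that case, and other implementations may return NaN instead.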

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
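
The nullity correlation described above can be read as an ordinary Pearson correlation between two presence indicators (1 where a value is present, 0 where it is missing). The sketch below shows that reading in pure Python; the function name `nullity_correlation`, the use of `None` for missing values, and the toy columns are assumptions for illustration rather than the report tool's implementation.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no variance.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns: the two values are missing in the same rows.
tp_ensino = [None, 1.0, 1.0, None]
co_uf_esc = [None, 23.0, 23.0, None]
together = nullity_correlation(tp_ensino, co_uf_esc)
print(together)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one is present where the other is missing, and fully populated columns yield NaN because their indicator never varies.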

Sample

First rows

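(index) | NU_ANO | TP_SEXO | TP_ESTADO_CIVIL | TP_COR_RACA | TP_NACIONALIDADE | TP_ST_CONCLUSAO | TP_ESCOLA | TP_ENSINO | IN_TREINEIRO | CO_MUNICIPIO_ESC | NO_MUNICIPIO_ESC | CO_UF_ESC | SG_UF_ESC | TP_DEPENDENCIA_ADM_ESC | TP_LOCALIZACAO_ESC | TP_SIT_FUNC_ESC | CO_MUNICIPIO_PROVA | NO_MUNICIPIO_PROVA | CO_UF_PROVA | SG_UF_PROVA | Q001 | Q002 | Q003 | Q004 | Q005 | Q006 | Q007 | Q008 | Q009 | Q010 | Q011 | Q012 | Q013 | Q014 | Q015 | Q016 | Q017 | Q018 | Q019 | Q020 | Q021 | Q022 | Q023 | Q024 | Q025 | falta
0 | 2019 | M | 1 | 3 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3552205 | Sorocaba | 35 | SP | B | B | A | A | 2 | A | A | B | C | A | A | B | A | A | A | A | A | A | A | A | A | C | A | C | B | 1
1 | 2019 | M | 1 | 1 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2910800 | Feira de Santana | 29 | BA | C | E | B | B | 4 | C | A | B | C | A | A | B | A | A | A | A | A | A | B | A | A | E | A | A | B | 1
2 | 2019 | F | 1 | 3 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2304400 | Fortaleza | 23 | CE | E | E | B | C | 2 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | C | A | A | B | 1
3 | 2019 | F | 1 | 1 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1721000 | Palmas | 17 | TO | C | E | B | D | 4 | E | A | C | C | A | B | B | A | B | A | A | A | A | C | A | B | D | A | B | B | 0
4 | 2019 | F | 1 | 1 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3118601 | Contagem | 31 | MG | E | E | C | C | 3 | G | A | C | D | B | A | B | B | B | A | B | A | A | B | A | A | C | B | B | B | 0
5 | 2019 | M | 2 | 2 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5107602 | Rondonópolis | 51 | MT | B | B | C | B | 7 | E | A | B | D | A | C | B | A | B | A | B | A | A | B | A | A | E | A | B | B | 0
6 | 2019 | F | 1 | 3 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2924009 | Paulo Afonso | 29 | BA | E | B | C | B | 3 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0
7 | 2019 | M | 1 | 3 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3550308 | São Paulo | 35 | SP | E | B | B | B | 2 | D | A | B | C | A | A | B | B | B | A | B | A | A | B | A | A | C | A | A | B | 1
8 | 2019 | F | 1 | 1 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3550308 | São Paulo | 35 | SP | B | C | C | F | 6 | C | A | B | C | A | A | B | B | A | A | B | A | A | B | A | A | B | B | A | A | 1
9 | 2019 | M | 1 | 1 | 1 | 2 | 2 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3550308 | São Paulo | 35 | SP | E | E | B | B | 5 | F | A | B | C | B | A | B | B | B | A | B | A | A | B | A | B | C | B | B | B | 0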

Last rows

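(index) | NU_ANO | TP_SEXO | TP_ESTADO_CIVIL | TP_COR_RACA | TP_NACIONALIDADE | TP_ST_CONCLUSAO | TP_ESCOLA | TP_ENSINO | IN_TREINEIRO | CO_MUNICIPIO_ESC | NO_MUNICIPIO_ESC | CO_UF_ESC | SG_UF_ESC | TP_DEPENDENCIA_ADM_ESC | TP_LOCALIZACAO_ESC | TP_SIT_FUNC_ESC | CO_MUNICIPIO_PROVA | NO_MUNICIPIO_PROVA | CO_UF_PROVA | SG_UF_PROVA | Q001 | Q002 | Q003 | Q004 | Q005 | Q006 | Q007 | Q008 | Q009 | Q010 | Q011 | Q012 | Q013 | Q014 | Q015 | Q016 | Q017 | Q018 | Q019 | Q020 | Q021 | Q022 | Q023 | Q024 | Q025 | falta
5095260 | 2019 | F | 1 | 1 | 1 | 2 | 2 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3550308 | São Paulo | 35 | SP | E | E | B | B | 3 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0
5095261 | 2019 | F | 0 | 3 | 1 | 1 | 1 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5008305 | Três Lagoas | 50 | MS | H | G | F | F | 4 | B | A | A | B | A | A | A | A | A | A | A | A | A | B | A | A | B | A | A | A | 1
5095262 | 2019 | F | 2 | 1 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3304557 | Rio de Janeiro | 33 | RJ | C | B | C | B | 2 | E | A | B | B | B | A | B | A | B | A | B | A | B | B | A | A | C | A | B | B | 1
5095263 | 2019 | F | 2 | 3 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1501303 | Barcarena | 15 | PA | H | E | F | B | 3 | E | A | B | C | A | A | B | B | A | A | A | A | A | B | B | A | C | A | B | A | 0
5095264 | 2019 | F | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3534401 | Osasco | 35 | SP | H | E | C | F | 3 | D | A | B | C | B | A | B | B | B | A | B | A | A | C | B | A | D | A | B | B | 0
5095265 | 2019 | F | 4 | 3 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5300108 | Brasília | 53 | DF | E | B | C | F | 4 | C | A | B | C | A | A | B | A | B | B | B | A | A | B | A | A | B | B | A | B | 1
5095266 | 2019 | M | 1 | 3 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1302603 | Manaus | 13 | AM | C | F | A | D | 7 | C | A | B | C | A | A | B | A | B | B | A | A | A | B | B | A | E | A | A | B | 1
5095267 | 2019 | M | 0 | 0 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4302105 | Bento Gonçalves | 43 | RS | E | E | D | D | 3 | C | A | B | C | B | A | B | A | B | A | A | A | A | B | A | A | C | B | B | B | 1
5095268 | 2019 | M | 0 | 0 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4302105 | Bento Gonçalves | 43 | RS | H | C | F | F | 3 | A | A | B | C | A | A | B | B | B | A | B | A | A | C | B | B | A | A | A | B | 1
5095269 | 2019 | F | 0 | 0 | 1 | 1 | 1 | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4302105 | Bento Gonçalves | 43 | RS | H | H | F | F | 3 | C | A | B | C | B | A | B | A | B | A | A | A | A | B | A | A | B | B | B | B | 1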

Duplicate rows

Most frequently occurring

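(index) | NU_ANO | TP_SEXO | TP_ESTADO_CIVIL | TP_COR_RACA | TP_NACIONALIDADE | TP_ST_CONCLUSAO | TP_ESCOLA | TP_ENSINO | IN_TREINEIRO | CO_MUNICIPIO_ESC | NO_MUNICIPIO_ESC | CO_UF_ESC | SG_UF_ESC | TP_DEPENDENCIA_ADM_ESC | TP_LOCALIZACAO_ESC | TP_SIT_FUNC_ESC | CO_MUNICIPIO_PROVA | NO_MUNICIPIO_PROVA | CO_UF_PROVA | SG_UF_PROVA | Q001 | Q002 | Q003 | Q004 | Q005 | Q006 | Q007 | Q008 | Q009 | Q010 | Q011 | Q012 | Q013 | Q014 | Q015 | Q016 | Q017 | Q018 | Q019 | Q020 | Q021 | Q022 | Q023 | Q024 | Q025 | falta | # duplicates
5338 | 2019 | M | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2312403.0 | São Gonçalo do Amarante | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2312403 | São Gonçalo do Amarante | 23 | CE | D | D | B | B | 5 | B | A | B | C | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 104
2575 | 2019 | F | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2312403.0 | São Gonçalo do Amarante | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2312403 | São Gonçalo do Amarante | 23 | CE | D | D | B | B | 5 | B | A | B | C | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 87
1569 | 2019 | F | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2303709.0 | Caucaia | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2303709 | Caucaia | 23 | CE | B | B | A | B | 4 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 71
4455 | 2019 | M | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2303709.0 | Caucaia | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2303709 | Caucaia | 23 | CE | B | B | A | B | 4 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 68
4833 | 2019 | M | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2304954.0 | Guaiúba | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2309706 | Pacatuba | 23 | CE | B | B | A | A | 5 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 55
2048 | 2019 | F | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2304954.0 | Guaiúba | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2309706 | Pacatuba | 23 | CE | B | B | A | A | 5 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 54
3405 | 2019 | M | 0 | 0 | 1 | 2 | 2 | 1.0 | 0 | 2302206.0 | Beberibe | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2302206 | Beberibe | 23 | CE | H | H | F | F | 4 | C | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | B | 0 | 51
4821 | 2019 | M | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2304707.0 | Granja | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2304707 | Granja | 23 | CE | D | D | A | A | 4 | A | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | A | A | A | A | 0 | 51
1918 | 2019 | F | 1 | 3 | 1 | 2 | 2 | 1.0 | 0 | 2304400.0 | Fortaleza | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2304400 | Fortaleza | 23 | CE | H | H | B | B | 3 | B | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | A | 0 | 47
9 | 2019 | F | 0 | 0 | 1 | 2 | 2 | 1.0 | 0 | 2302206.0 | Beberibe | 23.0 | CE | 2.0 | 1.0 | 1.0 | 2302206 | Beberibe | 23 | CE | H | H | F | F | 4 | C | A | B | B | A | A | B | A | A | A | A | A | A | B | A | A | B | A | A | B | 0 | 46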
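
A ranking like the one above can be approximated by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below does this in pure Python on a few made-up three-column rows; the function name `duplicate_summary` is hypothetical, and it counts total occurrences per row, which is an assumption about how the `# duplicates` column is defined.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (row, occurrences) pairs for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, count) for row, count in counts.items() if count > 1]
    duplicated.sort(key=lambda pair: pair[1], reverse=True)
    return duplicated


# Hypothetical toy rows (NU_ANO, TP_SEXO, SG_UF_PROVA); not taken from the dataset.
rows = [
    ("2019", "M", "CE"),
    ("2019", "F", "CE"),
    ("2019", "M", "CE"),
    ("2019", "M", "SP"),
    ("2019", "F", "CE"),
    ("2019", "M", "CE"),
]
summary = duplicate_summary(rows)
for row, count in summary:
    print(row, count)
```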